AITopics | cross-lingual ability

cross-lingual ability

Semantic Pivots Enable Cross-Lingual Transfer in Large Language Models

He, Kaiyu, Zhou, Tong, Chen, Yubo, Qiu, Delai, Liu, Shengping, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence, May-23-2025

Large language models (LLMs) demonstrate remarkable ability in cross-lingual tasks. Understanding how LLMs acquire this ability is crucial for their interpretability. To quantify the cross-lingual ability of LLMs accurately, we propose a Word-Level Cross-Lingual Translation Task. To find how LLMs learn cross-lingual ability, we trace the outputs of LLMs' intermediate layers in the word translation task. We identify and distinguish two distinct behaviors in the forward pass of LLMs: co-occurrence behavior and semantic pivot behavior. We attribute LLMs' two distinct behaviors to the co-occurrence frequency of words and find the semantic pivot from the pre-training dataset. Finally, to apply our findings to improve the cross-lingual ability of LLMs, we reconstruct a semantic pivot-aware pre-training dataset using documents with a high proportion of semantic pivots. Our experiments validate the effectiveness of our approach in enhancing cross-lingual ability. Our research contributes insights into the interpretability of LLMs and offers a method for improving LLMs' cross-lingual ability.
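The two data-side ideas in this abstract, word co-occurrence frequency and selecting documents with a high proportion of semantic pivots, can be illustrated with a minimal Python sketch. The abstract does not say how pivots are identified or which threshold is used, so the `pivots` set, the `threshold` value, the whitespace tokenization, and all function names below are assumptions made for illustration only, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered word pair, how many documents contain both words."""
    counts = Counter()
    for doc in docs:
        # Whitespace tokenization is an assumption; a real pipeline would
        # use the model's own tokenizer.
        vocab = sorted(set(doc.lower().split()))
        counts.update(combinations(vocab, 2))
    return counts


def pivot_proportion(doc, pivots):
    """Fraction of tokens in `doc` that belong to the (assumed, given) pivot set."""
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    return sum(tok in pivots for tok in tokens) / len(tokens)


def select_pivot_rich(docs, pivots, threshold):
    """Keep documents whose pivot proportion reaches the hypothetical threshold."""
    return [doc for doc in docs if pivot_proportion(doc, pivots) >= threshold]
```

For example, `select_pivot_rich(docs, pivots={"animal"}, threshold=0.3)` keeps only the documents in which at least 30% of the tokens are in the pivot set.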

cross-lingual ability, large language model, machine learning, (15 more...)

2505.16385

Country:

Asia (0.93)
North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Analyzing the Evaluation of Cross-Lingual Knowledge Transfer in Multilingual Language Models

Rajaee, Sara, Monz, Christof

arXiv.org Artificial Intelligence, Feb-3-2024

Recent advances in training multilingual language models on large datasets seem to have shown promising results in knowledge transfer across languages and achieve high performance on downstream tasks. However, we question to what extent the current evaluation benchmarks and setups accurately measure zero-shot cross-lingual knowledge transfer. In this work, we challenge the assumption that high zero-shot performance on target tasks reflects high cross-lingual ability by introducing more challenging setups involving instances with multiple languages. Through extensive experiments and analysis, we show that the observed high performance of multilingual models can be largely attributed to factors not requiring the transfer of actual linguistic knowledge, such as task- and surface-level knowledge. More specifically, we observe what has been transferred across languages is mostly data artifacts and biases, especially for low-resource languages. Our findings highlight the overlooked drawbacks of existing cross-lingual test data and evaluation setups, calling for a more nuanced understanding of the cross-lingual capabilities of multilingual models.

computational linguistic, multilingual model, proceedings, (15 more...)

2402.02099

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

Ranaldi, Leonardo, Pucci, Giulia, Freitas, Andre

arXiv.org Artificial Intelligence, Aug-27-2023

The language ability of Large Language Models (LLMs) is often unbalanced towards English because of the imbalance in the distribution of the pre-training data. This disparity carries over into further fine-tuning and affects the cross-lingual abilities of LLMs. In this paper, we propose to empower Instruction-tuned LLMs (It-LLMs) in languages other than English by building semantic alignment between them. Hence, we propose CrossAlpaca, an It-LLM with cross-lingual instruction-following and Translation-following demonstrations to improve semantic alignment between languages. We validate our approach on the multilingual Question Answering (QA) benchmarks XQUAD and MLQA and adapted versions of MMLU and BBH. Our models, tested over six different languages, outperform the It-LLMs tuned on monolingual data. The final results show that instruction tuning on non-English data is not enough and that semantic alignment can be further improved by Translation-following demonstrations.

demonstration, large language model, natural language, (15 more...)

doi: 10.18653/v1/2024.findings-acl.473

2308.14186

Country:

Europe > Switzerland (0.14)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
(8 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Cross-Lingual Ability of Multilingual BERT: An Empirical Study

K, Karthikeyan, Wang, Zihan, Mayhew, Stephen, Roth, Dan

arXiv.org Artificial Intelligence, Dec-17-2019

Recent work has exhibited the surprising cross-lingual abilities of multilingual BERT (M-BERT) - surprising since it is trained without any cross-lingual objective and with no aligned data. In this work, we provide a comprehensive study of the contribution of different components in M-BERT to its cross-lingual ability. The experimental study is done in the context of three typologically different languages - Spanish, Hindi, and Russian - and using two conceptually different NLP tasks, textual entailment and named entity recognition. Among our key conclusions is the fact that the lexical overlap between languages plays a negligible role in the cross-lingual success, while the depth of the network is an integral part of it. Embeddings of natural language text via unsupervised learning, coupled with sufficient supervised training data, have been ubiquitous in NLP in recent years and have shown success in a wide range of monolingual NLP tasks, mostly in English. Training models for other languages have been shown more difficult, and recent approaches relied on bilingual embeddings that allowed the transfer of supervision in high resource languages like English to models in lower resource languages; however, inducing these bilingual embeddings required some level of supervision (Upadhyay et al., 2016). Not only the model is contextual, but its training also requires no supervision - no alignment between the languages is done. Nevertheless, and despite being trained with no explicit cross-lingual objective, M-BERT produces a representation that seems to generalize well across languages for a variety of downstream tasks (Wu & Dredze, 2019). In this work, we attempt to develop an understanding of the success of M-BERT.
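To make "lexical overlap between languages" concrete, here is a minimal Python sketch that scores the vocabulary overlap of two small corpora. The Jaccard ratio, the whitespace tokenization (a stand-in for word pieces), and the function name `lexical_overlap` are assumptions chosen for illustration; the abstract does not state how the authors measure overlap.

```python
def lexical_overlap(corpus_a, corpus_b):
    """Jaccard overlap between the token vocabularies of two corpora.

    Each corpus is a list of sentences. Tokens are lowercased,
    whitespace-separated words (an assumed stand-in for word pieces).
    """
    vocab_a = {tok for sent in corpus_a for tok in sent.lower().split()}
    vocab_b = {tok for sent in corpus_b for tok in sent.lower().split()}
    union = vocab_a | vocab_b
    if not union:
        # Two empty corpora share nothing; define the overlap as zero.
        return 0.0
    return len(vocab_a & vocab_b) / len(union)
```

A score of 0.0 means the two corpora share no tokens, and a score of 1.0 means their vocabularies are identical.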

b-bert, cross-lingual ability, similarity, (15 more...)

1912.0784

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.14)
Oceania > Australia (0.04)
(6 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
